tutorials/001 - Introduction.ipynb (122 lines of code) (raw):
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[](https://github.com/aws/aws-sdk-pandas)\n",
"\n",
"# 1 - Introduction"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## What is AWS SDK for pandas?\n",
"\n",
"An [open-source](https://github.com/aws/aws-sdk-pandas) Python package that extends the power of [Pandas](https://github.com/pandas-dev/pandas) library to AWS connecting **DataFrames** and AWS data related services (**Amazon Redshift**, **AWS Glue**, **Amazon Athena**, **Amazon Timestream**, **Amazon EMR**, etc).\n",
"\n",
"Built on top of other open-source projects like [Pandas](https://github.com/pandas-dev/pandas), [Apache Arrow](https://github.com/apache/arrow) and [Boto3](https://github.com/boto/boto3), it offers abstracted functions to execute usual ETL tasks like load/unload data from **Data Lakes**, **Data Warehouses** and **Databases**.\n",
"\n",
"Check our [list of functionalities](https://aws-sdk-pandas.readthedocs.io/en/3.11.0/api.html)."
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## How to install?\n",
"\n",
"awswrangler runs almost anywhere over Python 3.8, 3.9 and 3.10, so there are several different ways to install it in the desired environment.\n",
"\n",
" - [PyPi (pip)](https://aws-sdk-pandas.readthedocs.io/en/3.11.0/install.html#pypi-pip)\n",
" - [Conda](https://aws-sdk-pandas.readthedocs.io/en/3.11.0/install.html#conda)\n",
" - [AWS Lambda Layer](https://aws-sdk-pandas.readthedocs.io/en/3.11.0/install.html#aws-lambda-layer)\n",
" - [AWS Glue Python Shell Jobs](https://aws-sdk-pandas.readthedocs.io/en/3.11.0/install.html#aws-glue-python-shell-jobs)\n",
" - [AWS Glue PySpark Jobs](https://aws-sdk-pandas.readthedocs.io/en/3.11.0/install.html#aws-glue-pyspark-jobs)\n",
" - [Amazon SageMaker Notebook](https://aws-sdk-pandas.readthedocs.io/en/3.11.0/install.html#amazon-sagemaker-notebook)\n",
" - [Amazon SageMaker Notebook Lifecycle](https://aws-sdk-pandas.readthedocs.io/en/3.11.0/install.html#amazon-sagemaker-notebook-lifecycle)\n",
" - [EMR Cluster](https://aws-sdk-pandas.readthedocs.io/en/3.11.0/install.html#emr-cluster)\n",
" - [From source](https://aws-sdk-pandas.readthedocs.io/en/3.11.0/install.html#from-source)\n",
"\n",
"Some good practices for most of the above methods are:\n",
" - Use new and individual Virtual Environments for each project ([venv](https://docs.python.org/3/library/venv.html))\n",
" - On Notebooks, always restart your kernel after installations."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Let's Install it!"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install awswrangler"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> Restart your kernel after the installation!"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'2.0.0'"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import awswrangler as wr\n",
"\n",
"wr.__version__"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "awswrangler-v9JnknIF-py3.8",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.5"
},
"vscode": {
"interpreter": {
"hash": "83297b058d59ee0acd247586c837429190a8258f15c0eea6234359f5557dde51"
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}